Sparse Index Tracking Based on $L_{1/2}$ Model and Algorithm

Fengmin Xu, Zongben Xu, Honggang Xue


Abstract. Recently, $L_1$ regularization has attracted extensive attention and has been
successfully applied in mean-variance portfolio selection to improve out-of-sample
properties and reduce transaction costs. However, the $L_1$ regularization approach
is ineffective in promoting sparsity and in selecting the regularization parameter for
index tracking under the budget and no-short-selling constraints, since the 1-norm of
the asset weights then has a constant value of one. Our recent research on $L_{1/2}$
regularization has found that the half thresholding algorithm, with an optimal
regularization parameter setting strategy, is a fast solver for $L_{1/2}$ regularization
and can provide sparser solutions. In this paper we apply the $L_{1/2}$ regularization
method to stock index tracking and establish a new sparse index tracking model. A hybrid
half thresholding algorithm is proposed for solving the model. Empirical tests of the
model and algorithm are carried out on the eight data sets from the OR-Library. The
optimal tracking portfolio obtained from the new model and algorithm has lower
out-of-sample prediction error and better consistency between in-sample and
out-of-sample performance. Moreover, since the regularization parameters are selected
automatically for a fixed number of stocks in the optimal portfolio, our algorithm is a
fast solver, especially for large-scale problems.

Keywords: Index tracking; $L_{1/2}$ regularization; Half thresholding algorithm.

1 Introduction

Stock index derivatives, such as index funds, index futures and index options, have
developed very rapidly and have become important tools for investment and risk
management in global financial markets; in particular, they have helped to stabilize
the stock market during the global financial crisis. Index tracking (also called index
replication) plays a core role in the product design and risk management of index
derivatives. It consists in constructing a tracking portfolio whose behavior is as
similar as possible to that of the target index during a predefined period.

Broadly speaking, two different strategies can be used to track a given stock
market index: full replication and non-full replication. Full replication consists in
purchasing all constituent stocks of a given index. In practice, this strategy incurs
high transaction costs. The alternative is non-full replication, which includes
stratified sampling replication and optimal replication. Since the selection of the
stocks in stratified sampling replication depends on the manager’s experience, the
resulting tracking portfolio is not optimal; we therefore focus on the optimal
replication method in this paper. Optimal replication aims to find, by means of an
optimization method, the portfolio that minimizes the tracking error while investing
in only a subset of the assets. This strategy involves much lower transaction costs
and can, in principle, achieve acceptable tracking errors.

Different quantitative methods have been proposed to tackle this optimization
problem. Roll establishes optimal index tracking models and proposes a mean-variance
analysis of index tracking building on Markowitz’s earlier study [25]. Tabata and
Takeda discuss index fund management based on the mean-variance model [29]. Buckley
and Korn apply optimal impulse control techniques to the index tracking problem
with fixed and proportional transaction costs [6]. Rudolf et al. propose several
piecewise linear measures of the tracking error and solve the problem by means of linear
programming [26]. Alexander proposes the construction of tracking portfolios by
analyzing the cointegration structure between the time series of each of the assets and
the time series of the tracked index [T]. Ammann and Zimmermann investigate the
relationship between several statistical measures of tracking error and asset
allocation restrictions based on admissible weight ranges [2]. Gilli and Kellezi propose
the use of the threshold accepting heuristic to solve the problem, including cardinality
restrictions and transaction costs [IB]. Beasley et al. address the index tracking
problem using evolutionary heuristics with real-valued chromosome representations [1].
Lobo et al. investigate the portfolio optimization problem with transaction costs,
which they address by means of a heuristic relaxation method that consists in solving
a small number of convex optimization problems using fixed transaction costs [22].
Torrubiano and Beasley present a nonlinear mixed-integer optimization model and a
corresponding algorithm for index tracking [7]. Torrubiano et al. design a hybrid
strategy that combines an evolutionary algorithm with quadratic programming to yield
the optimal tracking portfolio that invests only in the selected assets [30].

On the other hand, statistical regularization methods have been successfully
applied in mean-variance portfolio selection in order to promote the identification of
sparse portfolios with good out-of-sample properties and low transaction costs dUEl
[TB]. DeMiguel et al. focus on the effect of the constraints on the covariance
regularization, a technical extension of the result in Jagannathan and Ma [20]. Brodie
et al. emphasize the sparsity of the portfolio allocation and the optimization
algorithms by using the LASSO ($L_1$ regularization |3T]); they also note that the idea
of using $L_1$ regularization can be applied to the index tracking problem with short
selling constraints. A prominent contribution of Fan et al. is to provide mathematical
insights into the utility approximations with the gross-exposure constraint.
These approaches rely on imposing upper bounds on the 2-norm of the portfolio
weights, as suggested by ridge regression ($L_2$ regularization [I9]), or on
the 1-norm, using the $L_1$ regularization approach. Empirical results in a
mean-variance framework support the use of the $L_1$ regularization method when short
selling is allowed. However, the LASSO approach is ineffective in promoting sparsity in
the presence of the budget and no-short-selling constraints.

In the index tracking problem, the budget and no-short-selling constraints are
essential. If we use $L_1$ regularization to deal with the index tracking problem,
several defects arise. First, $L_1$ regularization cannot provide a sparser
optimal solution, since the 1-norm of the asset weights has a constant value
of one. Second, the selection of the regularization parameter is a hard problem for
$L_1$ regularization when the number of stocks in the optimal tracking portfolio is
fixed. Finally, the usual optimization strategy for constrained $L_1$ regularization is
the penalty function method, for which the penalty factor is even more difficult to
select.

Fortunately, our recent studies on $L_{1/2}$ regularization have found that $L_{1/2}$
regularization can overcome these defects of $L_1$ regularization [321 ESI [M]. The
reasons are as follows. Firstly, $L_{1/2}$ regularization yields a sparser tracking
portfolio than $L_1$ regularization ini; that is, we can use the fewest stocks to track
the target index while controlling the turnover. Secondly, although $L_{1/2}$
regularization is a nonconvex, non-smooth and non-Lipschitz optimization problem, we
have derived a fast and effective half thresholding algorithm for its solution, which is
especially suited to large-scale problems [3l]. Finally, to decrease transaction costs
and make the portfolio easy to manage, managers often request a sparse tracking
portfolio with a fixed number $K$ of stocks to track the target index when the index has
a large number of constituents. For this $K$-sparsity index tracking problem, the
regularization parameter of the half thresholding algorithm adjusts automatically to an
appropriate value, whatever the initial condition is.

Based on the above analysis, the main work of this paper is to design a sparse
index tracking model and algorithm by introducing $L_{1/2}$ regularization. Rather than
focusing on finding the portfolio that is optimal with respect to the recent historical
evolution of the assets, we are interested in the future tracking performance of the
portfolio. In section 2, we briefly review the index tracking model in [30] and $L_{1/2}$
regularization with the half thresholding algorithm. A sparse index tracking model and a
hybrid half thresholding algorithm, both based on $L_{1/2}$ regularization, are derived
in section 3. Empirical comparisons are conducted in section 4.
The data is partitioned into training data and testing data; the training data is used
to construct the optimal tracking portfolio investing in a subset of the index assets.
The performance of this tracking portfolio is then evaluated not only on the in-sample
data, but also on the out-of-sample data. In this way, an optimal tracking portfolio of
the sparse index tracking model that is suboptimal on the training data can still have
better out-of-sample performance on the test data. We also define a consistency
indicator to discuss the performance of the new model both in-sample and
out-of-sample; the empirical comparison in section 4 illustrates these results. We
conclude the paper in section 5.

2 Preliminaries 

To give a precise formulation of the sparse index tracking model and algorithm,
we first review the index tracking model from the regression viewpoint [30]; then
we give a general account of $L_{1/2}$ regularization and the half thresholding
algorithm, which serve as the basis of the new model and algorithm.

2.1 Index tracking problem

In this subsection we present the index tracking model from the regression viewpoint,
by introducing the tracking error, which is treated as the objective function, and the
constraints of the index tracking problem.

The tracking error has many different definitions; consequently, different tracking
portfolio models have been introduced, see [21 Ea EH HEIES]. Most of them define the
tracking error based either on correlations between the returns of the tracking
portfolio and the index, or on estimates of the variance of the difference between the
returns of the index and the returns of the tracking portfolio [HI EE]. However,
Beasley et al. argue against the use of variance as a measure of tracking error, because
the tracking error would be zero whenever the difference between the return of the index
and that of the tracking portfolio is constant [1]. This is an undesirable result
because it does not take the tracking bias into account. Beasley et al. give a new
definition of tracking error, based on the mean squared difference between the returns
of the target index and of the tracking portfolio [1]; this definition takes the bias of
the tracking portfolio into account. Consequently, we adopt this definition of tracking
error in this paper.

Let $P_{it}$ be the time series of stock prices for the $N$ stocks that are included in
the given stock market index whose evolution we wish to replicate. Let $I(t)$ be the
time series of this index. All time series are defined at equally spaced times
$t = 1, 2, \ldots, T$. Under the fixed mixture strategy, the tracking error is defined as

$$\mathrm{TE} = \frac{1}{T}\sum_{t=1}^{T}\Big(\sum_{i=1}^{N} w_i r_{it} - R_t^I\Big)^2 \tag{2.1}$$

where

$r_{it}$: the return rate of stock $i$ at time $t$ during a single period, given by
equation (2.2) [formula not recoverable from the source];

$R_t^I$: the return rate of the target index at time $t$ during a single period, given
by equation (2.3) [formula not recoverable from the source];

$w_i$: the weight of stock $i$.


Let

$$R^I = (R_1^I, R_2^I, \ldots, R_T^I)^T \in \mathbb{R}^{T\times 1}$$

be the column vector of index return rates, and

$$R = (R_1, R_2, \ldots, R_N) \in \mathbb{R}^{T\times N},$$

where $R_i = (r_{i1}, r_{i2}, \ldots, r_{iT})^T$ is the column vector of return rates of
stock $i$, $i = 1, 2, \ldots, N$, so that $R$ is the matrix of all stocks’ return rates.
Let $w = (w_1, w_2, \ldots, w_N)^T$ be the $N \times 1$ column vector of stock weights.
The tracking error (2.1) can then be rewritten as

$$\mathrm{TE} = \frac{1}{T}\,\|Rw - R^I\|_2^2. \tag{2.4}$$

The aim of the index tracking problem is to find the optimal tracking portfolio by
minimizing the tracking error (2.4) under some constraints.
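As a concrete illustration of the tracking error in matrix form, the following is a minimal sketch in plain Python. The function name `tracking_error` and the toy return data are assumptions made for this example only; they do not come from the paper.

```python
def tracking_error(R, w, index_returns):
    """Mean squared tracking error (1/T) * ||R w - R^I||_2^2.

    R is a T x N list of rows (one row of stock returns per period),
    w is a list of N weights, index_returns is a list of T index returns.
    """
    T = len(R)
    total = 0.0
    for t in range(T):
        portfolio_return = sum(r * wi for r, wi in zip(R[t], w))
        total += (portfolio_return - index_returns[t]) ** 2
    return total / T


# Hypothetical toy data: T = 3 periods, N = 2 stocks.
R = [[0.01, 0.03], [-0.02, 0.00], [0.02, 0.04]]
index_returns = [0.02, -0.01, 0.03]

print(tracking_error(R, [0.5, 0.5], index_returns))  # equal weights
print(tracking_error(R, [1.0, 0.0], index_returns))  # first stock only
```

In this toy data the index return equals the average of the two stock returns in every period, so the equal-weight portfolio tracks it essentially exactly, while holding only the first stock leaves a constant gap of 0.01 per period.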

The first constraint of the index tracking model is the budget constraint

$$\sum_{i=1}^{N} w_i = 1,$$

which ensures that all the capital is invested in the tracking portfolio. The second
is the set of lower and upper bound constraints

$$\eta_i z_i \le w_i \le z_i \delta_i, \quad i = 1, \ldots, N. \tag{2.5}$$

The lower bounds on the investment ratios $w_i$ avoid very small investment volumes,
and the upper bounds control risk. The third is the cardinality constraint

$$\sum_{i=1}^{N} z_i = K, \quad z_i = 0 \text{ or } 1, \quad i = 1, \ldots, N, \tag{2.6}$$

where $K$, a given positive integer, is the number of stocks included in the tracking
portfolio. Constraint (2.6) reflects that $z_i = 1$ if asset $i$ is included in the
tracking portfolio; otherwise $z_i = 0$, and then $w_i = 0$ by (2.5).
Based on the definition of tracking error and the constraints, the basic index
tracking model is described as [30]

$$
\begin{aligned}
\min_{w,\,z}\quad & \frac{1}{T}\,\|Rw - R^I\|_2^2 \\
\text{s.t.}\quad & e^T w = 1, \\
& \eta_i z_i \le w_i \le z_i \delta_i, \quad i = 1, 2, \ldots, N, \\
& \sum_{i=1}^{N} z_i = K, \\
& z_i = 0 \text{ or } 1, \quad i = 1, 2, \ldots, N.
\end{aligned}
\tag{2.7}
$$

The above index tracking problem is an empirical risk minimization (ERM) model: the
optimal tracking portfolio obtained from (2.7) has the minimal in-sample tracking error,
but nothing is known about its out-of-sample error. The model (2.7) is hard to solve
because the cardinality constraint $\sum_{i=1}^{N} z_i = K$ is discrete and therefore
highly nonlinear. Many different optimization techniques have been proposed to tackle
such a hybrid nonlinear integer program, see laiiiEQ].
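The three groups of constraints (budget, bounds on held assets, cardinality) can be checked directly for a candidate weight vector. The sketch below is only an illustration: the function name `is_feasible` and the tolerance are assumptions, and the default bounds 0.01 and 0.5 are the values used later in the paper's experiments.

```python
def is_feasible(w, K, eta=0.01, delta=0.5, tol=1e-9):
    """Check the budget, bound and cardinality constraints for weights w.

    An asset counts as held (z_i = 1) when its weight is nonzero.
    """
    held = [wi for wi in w if abs(wi) > tol]
    budget_ok = abs(sum(w) - 1.0) <= tol             # e^T w = 1
    cardinality_ok = len(held) == K                  # sum of z_i equals K
    bounds_ok = all(eta - tol <= wi <= delta + tol for wi in held)
    return budget_ok and cardinality_ok and bounds_ok


print(is_feasible([0.5, 0.3, 0.2, 0.0], K=3))  # satisfies all constraints
print(is_feasible([0.6, 0.4, 0.0, 0.0], K=2))  # violates the upper bound 0.5
print(is_feasible([0.5, 0.5, 0.0, 0.0], K=3))  # holds 2 assets, not 3
```

Checking a given portfolio is easy; the difficulty noted in the text lies in searching over all supports of size $K$, which is what makes the model a mixed-integer problem.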

In contrast to these optimization techniques, we aim to give a new sparse index
tracking model that is easy to solve, by introducing $L_{1/2}$ regularization. The
new model can generate a sparse solution with good tracking performance both
in-sample and out-of-sample. To construct our model and solve it efficiently, we
review $L_{1/2}$ regularization and the half thresholding algorithm in the next
subsection.

2.2 $L_{1/2}$ regularization

In this subsection, we briefly introduce $L_{1/2}$ regularization and the half
thresholding algorithm [3^ l33| l3l], and explain why we use $L_{1/2}$ regularization
for solving the index tracking problem.

$L_{1/2}$ regularization is one of the statistical regularization methods used to
solve sparsity problems, which aim to find a sparse solution of a representation or
an equation. Typical sparsity problems include variable selection [3T],
visual coding [211 |2T], graphical modeling [23], error correction [9], matrix completion
[8] and compressed sensing [33l HH lOl [14].

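$L_{1/2}$ regularization can be modeled as the following optimization problem

$$\min_{x \in \mathbb{R}^N} \; \|Ax - b\|_2^2 + \lambda \|x\|_{1/2}^{1/2}, \tag{2.8}$$

where $A \in \mathbb{R}^{M\times N}$, $b \in \mathbb{R}^{M}$,
$x = (x_1, \ldots, x_N)^T \in \mathbb{R}^N$,
$\|x\|_{1/2} = \big(\sum_{i=1}^{N} |x_i|^{1/2}\big)^2$, and $\lambda$ is the
regularization parameter, which controls the sparsity of the optimal solution of (2.8).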
In general, $L_0$ regularization and $L_1$ regularization are also methods for
solving sparsity problems. $L_0$ regularization is

$$\min_{x} \; \|Ax - b\|_2^2 + \lambda \|x\|_0, \tag{2.9}$$

where $\|x\|_0$ denotes the number of nonzero components of $x$. $L_1$ regularization is

$$\min_{x} \; \|Ax - b\|_2^2 + \lambda \|x\|_1, \tag{2.10}$$

where $\|x\|_1$ denotes the 1-norm of $x$.

Unfortunately, $L_0$ regularization (2.9) is NP-hard and hardly tractable when the
dimension of $x$ is large. $L_1$ regularization (2.10), known as the Lasso, was
introduced in the nineties by Tibshirani j2I] and was also independently proposed by
Chen et al. [12] as the basis pursuit denoising problem. $L_1$ regularization is a convex
optimization problem and can be solved efficiently. However, although $L_1$
regularization provides the best convex approximation to $L_0$ regularization and
is computationally efficient, it cannot handle collinearity, may result in inconsistent
selection, and may introduce extra bias in estimation. One valid improvement is to use
$L_{1/2}$ regularization, which can generate sparser solutions than $L_1$
regularization.

Though $L_{1/2}$ regularization (2.8) leads to a nonconvex, non-smooth and
non-Lipschitz optimization problem, our recent studies [32l[33l[3l| resolved this
difficulty. By justifying the existence of the resolvent of the gradient of the penalty
and finding its analytic expression, we derived an iterative half thresholding algorithm
[33| for the fast solution of $L_{1/2}$ regularization. We proved an alternative feature
theorem on solutions of $L_{1/2}$ regularization, based on which a thresholding
representation of $L_{1/2}$ regularization is given and a novel regularization parameter
setting strategy is suggested. We verified the convergence of the iterative half
thresholding algorithm and provided a series of experiments and applications to assess
the performance of the algorithm.

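The half thresholding algorithm can be described as

$$x_{n+1} = H_{\lambda_n\mu_n,1/2}\big(x_n + \mu_n A^T (b - A x_n)\big),$$

where $H_{\lambda\mu,1/2}(x) = \big(h_{\lambda\mu,1/2}(x_1), h_{\lambda\mu,1/2}(x_2),
\ldots, h_{\lambda\mu,1/2}(x_N)\big)^T$ is the half thresholding operator. Written for a
generic parameter $\lambda$ (to be replaced by $\lambda_n\mu_n$ in the iteration), for
$i = 1, \ldots, N$,

$$
h_{\lambda,1/2}(x_i) =
\begin{cases}
\dfrac{2}{3}\,x_i\Big(1 + \cos\Big(\dfrac{2\pi}{3} - \dfrac{2}{3}\varphi_\lambda(x_i)\Big)\Big), & |x_i| > \dfrac{\sqrt[3]{54}}{4}\,\lambda^{2/3}, \\[2mm]
0, & \text{otherwise},
\end{cases}
$$

and

$$\cos\varphi_\lambda(x_i) = \frac{\lambda}{8}\Big(\frac{|x_i|}{3}\Big)^{-3/2}. \tag{2.12}$$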
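The componentwise half thresholding function can be written in a few lines. The sketch below follows the closed form published for $L_{1/2}$ thresholding, with a single parameter `lam` standing for the product $\lambda\mu$ used in the iteration; the function name and the sample inputs are assumptions for illustration.

```python
import math


def half_threshold(x, lam):
    """Componentwise half thresholding function h_{lam,1/2}(x)."""
    threshold = (54 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(x) <= threshold:
        return 0.0
    # phi is the angle defined by cos(phi) = (lam / 8) * (|x| / 3)^(-3/2).
    phi = math.acos((lam / 8.0) * (abs(x) / 3.0) ** -1.5)
    return (2.0 / 3.0) * x * (1.0 + math.cos(2.0 * math.pi / 3.0 - 2.0 * phi / 3.0))


# Inputs below the threshold are set to zero, larger ones are shrunk toward zero.
for value in (0.1, 1.0, 2.0, -2.0):
    print(value, half_threshold(value, lam=1.0))
```

With `lam = 0` the function reduces to the identity, and for `lam > 0` it has a jump at the threshold, which is what produces exact zeros in the iterates.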
It is known that the quality of the solutions of a regularization problem depends
heavily on the setting of the regularization parameter $\lambda$. The selection of
proper regularization parameters is, however, a very hard problem. Nevertheless, if the
solutions of problem (2.8) are $K$-sparse, the parameters can be found by formulating an
optimality condition on the regularization. Let
$B_\mu(x_n) = x_n + \mu A^T(b - A x_n)$, and let $[B_\mu(x_n)]_{K+1}$ be the
$(K+1)$-th largest component of $B_\mu(x_n)$ in magnitude. The parameters can then be
chosen as

$$\lambda_n = \frac{\sqrt{96}}{9\mu_0}\,\big|[B_{\mu_0}(x_n)]_{K+1}\big|^{3/2},$$

with the constant $\mu_n = \mu_0 > 0$.
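The parameter rule and the thresholding step can be combined into a short iteration. The following is a minimal sketch for a small dense problem stored as Python lists. The function names, the zero starting point, the iteration count and the small relative tolerance in the threshold test (a guard against floating-point rounding at the cut-off) are all assumptions of this sketch, not part of the paper.

```python
import math


def half_threshold(x, lam, rel_tol=1e-12):
    """Half thresholding function; rel_tol guards the cut-off against rounding."""
    threshold = (54 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(x) <= threshold * (1.0 + rel_tol):
        return 0.0
    phi = math.acos((lam / 8.0) * (abs(x) / 3.0) ** -1.5)
    return (2.0 / 3.0) * x * (1.0 + math.cos(2.0 * math.pi / 3.0 - 2.0 * phi / 3.0))


def k_sparse_half_thresholding(A, b, K, mu, iterations=200):
    """Iterate x <- H(x + mu * A^T (b - A x)) with lambda_n chosen for K-sparsity.

    A is an M x N list of rows, b has length M, and K must be smaller than N.
    """
    M, N = len(A), len(A[0])
    x = [0.0] * N
    for _ in range(iterations):
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(N)) for i in range(M)]
        B = [x[j] + mu * sum(A[i][j] * residual[i] for i in range(M)) for j in range(N)]
        magnitudes = sorted((abs(v) for v in B), reverse=True)
        # (K+1)-th largest magnitude sets the threshold via the parameter rule.
        lam = math.sqrt(96.0) / (9.0 * mu) * magnitudes[K] ** 1.5
        x = [half_threshold(v, lam * mu) for v in B]
    return x


# Hypothetical toy problem: identity design, one dominant coefficient.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, 0.5, 0.1]
print(k_sparse_half_thresholding(A, b, K=1, mu=0.5))
```

With this rule the threshold equals the $(K+1)$-th largest magnitude of $B_\mu(x_n)$, so at most $K$ components survive each step, which is why no manual tuning of $\lambda$ is needed.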

In fact, the problem of tracking a financial index using only a subset of stocks can
be regarded as a sparsity problem. If the number of stocks included in the tracking
portfolio is fixed, the problem of selecting the optimal $K$ stocks is a $K$-sparsity
problem, and can then be handled by $L_{1/2}$ regularization. Compared with $L_1$
regularization, index tracking with $L_{1/2}$ regularization can provide a sparser
tracking portfolio. Furthermore, $L_{1/2}$ regularization has a fast and efficient
algorithm with a better method of regularization parameter selection for the
$K$-sparsity problem. Hence, a new formulation of the index tracking problem using
$L_{1/2}$ regularization is presented in the next section.
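To see concretely why the 1-norm cannot distinguish between portfolios under the budget and no-short-selling constraints, while the $L_{1/2}$ penalty can, the following sketch evaluates three penalties on two feasible weight vectors. The vectors and function names are illustrative assumptions, not taken from the paper.

```python
import math


def l0_penalty(w):
    """Number of nonzero weights."""
    return sum(1 for wi in w if wi != 0)


def l1_penalty(w):
    """1-norm of the weights."""
    return sum(abs(wi) for wi in w)


def l_half_penalty(w):
    """Sum of square roots of absolute weights, i.e. ||w||_{1/2}^{1/2}."""
    return sum(math.sqrt(abs(wi)) for wi in w)


sparse = [0.5, 0.5, 0.0, 0.0]      # two assets held
dense = [0.25, 0.25, 0.25, 0.25]   # four assets held

for name, w in (("sparse", sparse), ("dense", dense)):
    print(name, l0_penalty(w), l1_penalty(w), l_half_penalty(w))
```

Both portfolios have 1-norm equal to one, so an $L_1$ penalty adds the same constant to each. The $L_{1/2}$ penalty is smaller for the portfolio with fewer holdings, so it still rewards sparsity.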


3 An $L_{1/2}$ regularization based model

In this section, we propose a new sparse index tracking model and the hybrid half
thresholding algorithm, based on the analysis of section 2.

3.1 The sparse index tracking model

Considering the constraints of the index tracking model (2.7) introduced in section
2, let

$$D_1 = \Big\{ w \;\Big|\; z_i\eta_i \le w_i \le z_i\delta_i,\ \sum_{i=1}^{N} z_i = K,\ z_i = 0 \text{ or } 1,\ i = 1, \ldots, N \Big\}.$$

The constraints $\sum_{i=1}^{N} z_i = K$, $z_i = 0$ or $1$, $i = 1, \ldots, N$, mean
that the number of nonzero components of the optimal tracking weight $w$ is $K$, with
$w_i = 0$ if $z_i = 0$ and $\eta_i \le w_i \le \delta_i$ if $z_i = 1$, for
$i = 1, \ldots, N$. Since $\|w\|_0$ is the number of nonzero components of $w$, we let

$$D_2 = \big\{ w \;\big|\; \|w\|_0 = K,\ w_i = 0 \text{ or } \eta_i \le w_i \le \delta_i,\ i = 1, \ldots, N \big\}.$$

Let $\mathrm{Supp}(w)$ be the support set of $w$, i.e.,
$\mathrm{Supp}(w) = \{ i \mid w_i \ne 0 \}$. A sparse, stable index tracking model is
obtained by adopting a regularization procedure: the constraint set $D_1$ is replaced by
$D_2$, so the sparse index tracking model based on $L_0$ regularization is

$$
\begin{aligned}
\min_{w}\quad & \frac{1}{T}\,\|Rw - R^I\|_2^2 \\
\text{s.t.}\quad & e^T w = 1, \\
& \|w\|_0 = K, \\
& \eta_i \le w_i \le \delta_i, \quad i \in \mathrm{Supp}(w), \\
& w_i = 0, \quad i \notin \mathrm{Supp}(w).
\end{aligned}
\tag{3.1}
$$


Based on the analysis of subsection 2.2, it is better to use $\|w\|_{1/2}^{1/2} = K$
in place of the constraint $\|w\|_0 = K$; we also omit the coefficient $\frac{1}{T}$ of
the objective function. A new sparse index tracking model based on $L_{1/2}$
regularization is then proposed as

$$
\begin{aligned}
\min_{w}\quad & \|Rw - R^I\|_2^2 \\
\text{s.t.}\quad & e^T w = 1, \\
& \|w\|_{1/2}^{1/2} = K, \\
& \eta_i \le w_i \le \delta_i, \quad i \in \mathrm{Supp}(w), \\
& w_i = 0, \quad i \notin \mathrm{Supp}(w).
\end{aligned}
\tag{3.2}
$$

In order to solve this model efficiently, we move the constraint
$\|w\|_{1/2}^{1/2} = K$ into the objective function using the penalty function method,
which gives the following model

$$
\begin{aligned}
\min_{w}\quad & \|Rw - R^I\|_2^2 + \lambda \|w\|_{1/2}^{1/2} \\
\text{s.t.}\quad & e^T w = 1, \\
& \eta_i \le w_i \le \delta_i, \quad i \in \mathrm{Supp}(w), \\
& w_i = 0, \quad i \notin \mathrm{Supp}(w),
\end{aligned}
\tag{3.3}
$$

where $\lambda$ is the regularization parameter; setting $\lambda = \infty$ produces the
totally constrained solution ($K = 0$), whereas $\lambda = 0$ yields the unrestricted
solution.

Minimization under $L_{1/2}$ constraints is now a widely used technique when sparse
solutions are desirable. In the index tracking problem, sparsity also plays a key role
in the task of formulating the tracking portfolio. In practice, managers often want to
limit the number of assets or the proportion of investment in the tracking problem. The
index tracking problem can then be regarded as a sparsity problem. Fortunately, the new
model (3.3) can provide a sparse solution by controlling the parameter $\lambda$.

Furthermore, we define two indicators to test the consistency and the out-of-sample
prediction error of the sparse index tracking model. The consistency of the
model is measured by the absolute difference between the in-sample and the
out-of-sample tracking errors. A smaller value of this indicator means a higher
consistency of the index tracking model. Empirical tests given in section 4 show that
our index tracking model has high consistency, so it performs well both in-sample and
out-of-sample. Next we give a remark presenting the index tracking model based on $L_1$
regularization.

Remark 1. As Brodie et al. note in their paper, $L_1$ regularization can be used
to solve the index tracking problem with short selling constraints [5]. Without loss
of generality, we consider the no-short-selling index tracking problem in this paper.
The index tracking model using $L_1$ regularization is

$$
\begin{aligned}
\min_{w}\quad & \|Rw - R^I\|_2^2 + \lambda \|w\|_1 \\
\text{s.t.}\quad & e^T w = 1, \\
& \eta_i \le w_i \le \delta_i, \quad i \in \mathrm{Supp}(w), \\
& w_i = 0, \quad i \notin \mathrm{Supp}(w).
\end{aligned}
\tag{3.4}
$$

In the next subsection we give efficient algorithms to solve the above models.


3.2 A hybrid half thresholding algorithm 


As previously discussed, the half thresholding algorithm of subsection 2.2 solves
$L_{1/2}$ regularization without any constraints, whereas the model (3.3) is an
$L_{1/2}$ regularization with convex constraints. In this subsection we propose a hybrid
half thresholding algorithm to solve the model (3.3).

The hybrid half thresholding algorithm is divided into two steps, which handle
separately the $L_{1/2}$ regularization that selects the support set and the quadratic
optimization problem that consists in finding the optimal asset weights for the fixed
$K$ stocks. We first consider the unconstrained case, i.e. the minimization of the
objective function of model (3.3), and then discuss how to deal with the constraints.

In the first step, we discuss the algorithm for minimizing the objective function
of the index tracking model (3.3), that is,

$$\min_{w} \; \|Rw - R^I\|_2^2 + \lambda \|w\|_{1/2}^{1/2}. \tag{3.5}$$

Clearly, the model (3.5) can be regarded as the $L_{1/2}$ regularization (2.8) with
$A$ and $b$ replaced by $R$ and $R^I$. Supposing $w_n$ is the current iterate, the
iteration

$$w_{n+1} = H_{\lambda_n\mu_n,1/2}\big(w_n + \mu_n R^T (R^I - R w_n)\big)$$

is naturally defined; this is the half thresholding algorithm for $L_{1/2}$
regularization. Furthermore, if we need $K$ stocks to track the target index, the
model (3.5) can be regarded as a $K$-sparsity problem. Following the parameter-setting
strategies in [M], the parameters are chosen as

$$\mu_n = \mu_0, \qquad \lambda_n = \min\Big\{ \lambda_{n-1},\ \frac{\sqrt{96}}{9}\,\|R\|_2^2\,\big|[B_{\mu_n}(w_n)]_{K+1}\big|^{3/2} \Big\},$$

where

$$\mu_0 = \frac{1 - \varepsilon}{\|R\|_2^2}$$

with any small $\varepsilon \in (0, 1)$, $B_{\mu_n}(w_n) = w_n + \mu_n R^T(R^I - R w_n)$,
and $[\,\cdot\,]_{K+1}$ denotes the $(K+1)$-th largest component in magnitude. With this
choice, the iteration is adaptive and free from the choice of regularization
parameter.

In the second step, once the support set of the tracking portfolio $w$ has been
selected, the optimal asset weights satisfy $w_i = 0$ for $i \notin \mathrm{Supp}(w)$.
The nonzero optimal asset weights are obtained from the following quadratic program

$$
\begin{aligned}
\min_{w}\quad & \|Rw - R^I\|_2^2 \\
\text{s.t.}\quad & e^T w = 1, \\
& \eta_i \le w_i \le \delta_i, \quad i \in \mathrm{Supp}(w),
\end{aligned}
\tag{3.6}
$$

where $R \in \mathbb{R}^{T\times K}$ here denotes the returns matrix of the stocks with
nonzero weights. Very efficient algorithms exist for solving model (3.6). In this
paper we adopt the Matlab function quadprog to solve it.
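For readers without access to a quadratic programming solver, the weight-fitting step on a fixed support can be approximated with projected gradient descent. This is a swapped-in technique for illustration, not the quadprog call used in the paper. The function names, the step size rule (based on the Frobenius norm as a safe bound) and the iteration counts are assumptions of this sketch, and it presumes the feasible set is nonempty and the returns matrix is not all zeros.

```python
def project_budget_box(v, eta, delta, iterations=100):
    """Euclidean projection onto {w : sum(w) = 1, eta <= w_i <= delta}.

    The projection has the form clip(v_i - tau, eta, delta); tau is found by bisection.
    """
    lo, hi = min(v) - delta, max(v) - eta   # these shifts bracket sum == 1
    for _ in range(iterations):
        tau = (lo + hi) / 2.0
        total = sum(min(max(x - tau, eta), delta) for x in v)
        if total > 1.0:
            lo = tau
        else:
            hi = tau
    tau = (lo + hi) / 2.0
    return [min(max(x - tau, eta), delta) for x in v]


def solve_weights(R, index_returns, eta=0.01, delta=0.5, iterations=2000):
    """Minimize ||R w - R^I||_2^2 over the budget and bound constraints.

    R is a T x K list of rows holding the returns of the K selected stocks.
    """
    T, K = len(R), len(R[0])
    # Gradient Lipschitz constant is at most 2 * ||R||_F^2.
    step = 1.0 / (2.0 * sum(r * r for row in R for r in row))
    w = project_budget_box([1.0 / K] * K, eta, delta)
    for _ in range(iterations):
        residual = [sum(r * wi for r, wi in zip(row, w)) - y
                    for row, y in zip(R, index_returns)]
        grad = [2.0 * sum(R[t][j] * residual[t] for t in range(T)) for j in range(K)]
        w = project_budget_box([wi - step * g for wi, g in zip(w, grad)], eta, delta)
    return w


# Hypothetical toy data: the index is an exact mix 0.4 / 0.35 / 0.25 of three stocks.
R = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01], [0.01, 0.01, 0.01]]
target = [0.4, 0.35, 0.25]
index_returns = [sum(r * wi for r, wi in zip(row, target)) for row in R]
print(solve_weights(R, index_returns))
```

On well-conditioned toy data like this the iterates approach the generating weights. A dedicated QP solver would typically be faster and more accurate on real return data.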

Remark 2. For the index tracking problem using $L_1$ regularization (3.4),
we can design an algorithm similar to the hybrid half thresholding algorithm, which we
call the hybrid LARS algorithm. It is divided into two parts. First, Least Angle
Regression (LARS) [15] is used to solve the following problem,

$$\min_{w} \; \|Rw - R^I\|_2^2 + \lambda \|w\|_1. \tag{3.7}$$

The algorithm solves the above model for a range of values of the regularization
parameter $\lambda$, starting from a very large value and gradually decreasing
$\lambda$ until the desired value is attained. As $\lambda$ evolves, the optimal
solution moves along a piecewise affine path, which allows us to find the tracking
portfolio with the required $K$ nonzero asset weights. Second, the nonzero optimal asset
weights are obtained from the same quadratic program (3.6).

4 Empirical results 

In this section, we apply the sparse index tracking model (3.3) and the hybrid half
thresholding algorithm described above to construct optimal tracking portfolios and
evaluate their out-of-sample performance and consistency.

The empirical comparisons are conducted on benchmark problems from the
OR-Library (Beasley [3]). For the index tracking problem it contains the weekly stock
prices of the stocks included in major world market indexes. More specifically, we
consider the Hang Seng (Hong Kong), DAX 100 (Germany), FTSE (Great Britain), Standard
and Poor’s 100 (USA), Nikkei (Japan), Standard and Poor’s 500 (USA),
Russell 2000 (USA) and Russell 3000 (USA) indexes.

The purpose of the empirical tests is to assess the out-of-sample performance of the
sparse index tracking model and the hybrid half thresholding algorithm. For comparison,
two competing models and algorithms, namely Torrubiano’s model [30] with its
hybrid optimization approach and the $L_1$ model (3.4) with the hybrid LARS approach,
have also been applied alongside our model and the hybrid half thresholding algorithm.
As in the experiments carried out by Torrubiano et al. [30], the data sets of
weekly returns of the stocks included in the index are partitioned into a training set
containing the first half of the data (145 values) and a test set with the rest of the
data (145 values). The training sets are used to find the optimal tracking portfolio,
and the test sets are used to estimate the out-of-sample tracking error of
the tracking portfolio. The in-sample and out-of-sample tracking errors are denoted by
TEI and TEO respectively. To compare the consistency between in-sample and
out-of-sample errors, and the out-of-sample performance, of our model with those of
Torrubiano’s model [30] and the $L_1$ model, we define the following indicators.

• Consistency (Cons): This indicator measures the consistency of the
model between in-sample and out-of-sample errors, and is defined as

$$\mathrm{Cons} = |\mathrm{TEI} - \mathrm{TEO}|.$$

Clearly, a smaller value of Cons means that the model is more stable between
in-sample and out-of-sample.

• Superiority of out-of-sample (SupO): We define

$$\mathrm{SupO} = \frac{\mathrm{TEO1} - \mathrm{TEO2}}{\mathrm{TEO1}} \times 100\%,$$

where TEO1 and TEO2 are the out-of-sample tracking errors of model 1 and of our
model 2. If SupO > 0, then TEO2 is smaller than TEO1, i.e. model 2 has a
better out-of-sample error than model 1.
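The two indicators are simple arithmetic and can be computed as in the sketch below. The function names are assumptions; the input values are the printed errors for the Hang Seng index with K = 5 from Table 1.

```python
def consistency(tei, teo):
    """Cons = |TEI - TEO|."""
    return abs(tei - teo)


def superiority_out_of_sample(teo1, teo2):
    """SupO = (TEO1 - TEO2) / TEO1 * 100, in percent."""
    return (teo1 - teo2) / teo1 * 100.0


# Hang Seng, K = 5: our model (model 2) against Torrubiano's model (model 1).
tei2, teo2 = 5.81e-5, 4.19e-5
teo1 = 7.22e-5

print(consistency(tei2, teo2))
print(superiority_out_of_sample(teo1, teo2))
```

From these rounded inputs the consistency comes out as 1.62e-5, matching the table, and SupO comes out near 41.97, slightly different from the tabulated 41.91; the gap is presumably due to the table's SupO having been computed from unrounded errors.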

The tests were conducted on a personal computer (2.67 GHz, 4 GB of RAM) with the
MATLAB 7.9 programming platform (R2009b). The lower and upper bounds of the
asset weights are set to $\eta_i = 0.01$ and $\delta_i = 0.5$, $i = 1, 2, \ldots, N$.


A. Comparison with Torrubiano’s model

We present experiments to compare the performance of our model and Torrubiano’s
model on the Hang Seng (Hong Kong), DAX 100 (Germany), FTSE (Great
Britain), Standard and Poor’s 100 (USA) and Nikkei (Japan) indexes. The in-sample
error and the out-of-sample error of Torrubiano’s model are taken from [30]. Results for
the five data sets are summarized in Table 1 and Figure 1.

From Table 1 we see that:

• Our model has lower out-of-sample prediction error, since SupO > 0 in 80% (=
24/30) of the cases. This is not necessarily the case on the training sets: the emphasis
of our model is on improving the tracking error on the test data, for better predictive
performance. We take the FTSE index as an example. In Figure 1, the in-sample error and
the out-of-sample error of our model lie in the middle position, i.e. our model achieves
a better out-of-sample error than Torrubiano’s model at the cost of a higher in-sample
error.
Table 1: Comparison of Torrubiano’s model with our model

Index       K    Our model                        Torrubiano’s model               SupO(%)
(scale)          TEI2       TEO2       Cons2      TEI1       TEO1       Cons1

Hang Seng   5    5.81e-5    4.19e-5    1.62e-5    4.14e-5    7.22e-5    3.08e-5    41.91
(N=31)      6    5.01e-5    3.85e-5    1.16e-5    3.031e-5   4.76e-5    1.724e-5   19.03
            7    3.56e-5    2.62e-5    9.38e-6    2.37e-5    3.81e-5    1.44e-5    31.15
            8    2.61e-5    2.02e-5    5.92e-6    1.91e-5    2.90e-5    9.92e-6    30.36
            9    2.31e-5    1.63e-5    6.77e-6    1.62e-5    2.58e-5    9.59e-6    36.85
            10   1.84e-5    1.64e-5    2.07e-6    1.35e-5    2.06e-5    7.11e-6    20.36

DAX         5    4.57e-5    1.20e-4    7.40e-5    2.21e-5    1.02e-4    7.97e-5    -17.58
(N=85)      6    3.30e-5    8.78e-5    5.47e-5    1.76e-5    8.94e-5    7.17e-5    1.79
            7    2.41e-5    9.80e-5    7.39e-5    1.37e-5    8.46e-5    7.09e-5    -15.83
            8    2.14e-5    8.97e-5    6.83e-5    1.11e-5    7.93e-5    6.82e-5    -13.08
            9    1.94e-5    8.80e-5    6.86e-5    9.22e-6    7.78e-5    6.85e-5    -13.14
            10   2.96e-5    2.90e-5    5.68e-5    8.08e-6    7.48e-5    6.67e-5    61.22

FTSE        5    1.14e-4    9.01e-5    2.37e-5    6.42e-5    1.58e-4    9.39e-5    43.00
(N=89)      6    8.30e-5    8.68e-5    3.72e-6    4.96e-5    1.12e-4    6.23e-5    22.47
            7    7.91e-5    7.42e-5    4.87e-6    3.83e-5    9.07e-5    5.24e-5    18.15
            8    6.24e-5    6.72e-5    4.83e-6    2.90e-5    9.66e-5    6.76e-5    30.45
            9    5.60e-5    5.64e-5    6.19e-6    2.49e-5    8.59e-5    6.11e-5    34.41
            10   4.30e-5    4.92e-5    6.19e-6    2.18e-5    8.01e-5    5.82e-5    38.54

S&P         5    1.21e-4    1.09e-4    1.06e-5    4.50e-5    1.14e-4    6.92e-5    3.72
(N=98)      6    6.80e-5    8.30e-5    1.50e-5    3.37e-5    1.01e-4    6.70e-5    17.61
            7    8.72e-5    8.33e-5    3.88e-6    2.76e-5    7.80e-5    5.04e-5    -6.80
            8    3.89e-5    5.98e-5    2.08e-5    2.27e-5    6.76e-5    4.49e-5    11.66
            9    7.42e-5    4.90e-5    2.52e-5    1.94e-5    5.91e-5    3.97e-5    17.05
            10   3.99e-5    4.22e-5    2.25e-6    1.66e-5    5.55e-5    3.89e-5    23.96

Nikkei      5    1.26e-4    1.58e-4    3.19e-5    5.46e-5    1.63e-4    1.08e-4    2.87
(N=225)     6    1.15e-4    1.41e-4    2.58e-5    4.01e-5    1.47e-4    1.07e-4    3.93
            7    8.81e-5    1.21e-4    3.38e-5    3.36e-5    1.32e-4    9.88e-5    7.93
            8    5.94e-5    9.34e-5    3.40e-5    2.60e-5    1.10e-4    8.40e-5    15.08
            9    5.96e-5    8.14e-5    2.18e-5    2.13e-5    9.80e-5    1.68e-5    17.01
            10   7.08e-5    6.96e-5    1.29e-6    1.80e-5    6.47e-5    4.67e-5    -7.49

• Our model is more stable than Torrubiano’s model, since the consistency
indicator Cons2 is smaller than Cons1 in most cases.

Figure 1: Comparison of Torrubiano’s model and the L1/2 model based on the FTSE index

B. Comparison with the L1 model

Next we conduct experiments to compare the performance of our model, solved with the
hybrid half thresholding algorithm, with that of the L1 model, solved with the hybrid
LARS algorithm. Numerical results are listed in Table 2 and Figure 2.

From Table 2 we see that our model has a lower out-of-sample prediction error than
the L1 model, since SupO > 0 in 87% (= 26/30) of the cases. Moreover, we find that
our model can provide a sparser solution for tracking the target index, as Figure 2
shows: the out-of-sample prediction error of the L1 model using K = 10 stocks is the
same as that of our model using K = 5 stocks.
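
The 87% figure and the SupO column can be checked directly from the tabulated values.
The sketch below assumes that SupO is the relative reduction in out-of-sample tracking
error, SupO = 100 (TEO1 - TEO2) / TEO1. This definition is an assumption here, although
it reproduces the Hang Seng entries of Table 2 to within rounding. The names sup_o,
hang_seng and sup_o_table2 are illustrative only.

```python
# Hang Seng rows of Table 2: (K, TEO2 of our model, TEO1 of the L1 model, reported SupO %).
hang_seng = [
    (5, 4.19e-5, 9.35e-5, 55.15),
    (6, 3.85e-5, 7.09e-5, 45.73),
    (7, 2.62e-5, 5.64e-5, 53.50),
    (8, 2.02e-5, 5.95e-5, 66.05),
    (9, 1.63e-5, 5.13e-5, 68.20),
    (10, 1.64e-5, 4.20e-5, 61.01),
]


def sup_o(teo_ours, teo_other):
    """Relative out-of-sample improvement in percent (assumed definition)."""
    return 100.0 * (teo_other - teo_ours) / teo_other


for k, teo2, teo1, reported in hang_seng:
    # Recomputed values differ from the reported ones only by rounding of the inputs.
    print(f"K={k:2d}  recomputed={sup_o(teo2, teo1):6.2f}  reported={reported:6.2f}")

# SupO column of Table 2 for all 30 (index, K) cases, in table order.
sup_o_table2 = [
    55.15, 45.73, 53.50, 66.05, 68.20, 61.01,     # Hang Seng
    1.86, 11.24, -14.37, -11.48, -12.62, 62.82,   # DAX
    32.29, 26.31, 34.63, 42.19, 39.95, 43.74,     # FTSE
    12.39, 10.36, -10.95, 12.13, 18.00, 26.39,    # S&P
    24.72, 35.87, 32.85, 43.69, 49.92, 56.25,     # Nikkei
]
wins = sum(1 for s in sup_o_table2 if s > 0)
print(f"SupO > 0 in {wins}/{len(sup_o_table2)} cases "
      f"({100 * wins / len(sup_o_table2):.0f}%)")
```

The four cases with SupO < 0 are DAX with K = 7, 8, 9 and S&P with K = 7.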



Figure 2: Comparison of the L1 model and the L1/2 model based on the Hang Seng index

Finally, we discuss the large-scale index tracking problems, i.e., Standard and
Poor’s 500 (N = 457), Russell 2000 (N = 1318) and Russell 3000 (N = 2151). Since
the number of stocks included in these indexes is very large, we select the number of
tracking stocks K = 10, 20, 30, 40, 50. The numerical results are listed in Table 3.
Table 2: Comparison of the L1 model with our model

Index             Our model                       L1 model
Scale       K     TEI2      TEO2      Cons2       TEI1      TEO1      Cons1     SupO(%)

Hang Seng   5     5.81e-5   4.19e-5   1.62e-5     9.65e-5   9.35e-5   2.96e-6     55.15
N=31        6     5.01e-5   3.85e-5   1.16e-5     5.47e-5   7.09e-5   1.62e-5     45.73
            7     3.56e-5   2.62e-5   9.38e-6     4.74e-5   5.64e-5   9.06e-6     53.50
            8     2.61e-5   2.02e-5   5.92e-6     4.59e-5   5.95e-5   1.36e-5     66.05
            9     2.31e-5   1.63e-5   6.77e-6     4.41e-5   5.13e-5   7.13e-6     68.20
            10    1.84e-5   1.64e-5   2.07e-6     4.02e-5   4.20e-5   1.79e-6     61.01

DAX         5     4.57e-5   1.20e-4   7.40e-5     3.26e-5   1.22e-4   8.94e-5      1.86
N=85        6     3.30e-5   8.78e-5   5.47e-5     2.25e-5   9.89e-5   7.64e-5     11.24
            7     2.41e-5   9.80e-5   7.39e-5     1.66e-5   8.57e-5   6.90e-5    -14.37
            8     2.14e-5   8.97e-5   6.83e-5     1.60e-5   8.04e-5   6.44e-5    -11.48
            9     1.94e-5   8.80e-5   6.86e-5     1.49e-5   7.81e-5   6.32e-5    -12.62
            10    2.96e-5   2.90e-5   5.68e-5     1.43e-5   7.81e-5   6.37e-5     62.82

FTSE        5     1.14e-4   9.01e-5   2.37e-5     1.06e-5   1.33e-4   2.67e-5     32.29
N=89        6     8.30e-5   8.68e-5   3.72e-6     9.94e-5   1.18e-4   1.83e-5     26.31
            7     7.91e-5   7.42e-5   4.87e-6     8.78e-5   1.14e-4   2.57e-5     34.63
            8     6.24e-5   6.72e-5   4.83e-6     7.61e-5   1.16e-4   4.02e-5     42.19
            9     5.60e-5   5.64e-5   6.19e-6     5.62e-5   9.40e-5   3.41e-5     39.95
            10    4.30e-5   4.92e-5   6.19e-6     5.34e-5   8.75e-5   3.41e-5     43.74

S&P         5     1.21e-4   1.09e-4   1.06e-5     1.01e-4   1.26e-4   2.44e-5     12.39
N=98        6     6.80e-5   8.30e-5   1.50e-5     8.15e-5   9.26e-4   1.10e-5     10.36
            7     8.72e-5   8.33e-5   3.88e-6     5.56e-5   7.51e-5   1.95e-5    -10.95
            8     3.89e-5   5.98e-5   2.08e-5     4.44e-5   6.80e-5   2.36e-5     12.13
            9     7.42e-5   4.90e-5   2.52e-5     4.27e-5   5.98e-5   1.71e-5     18.00
            10    3.99e-5   4.22e-5   2.25e-6     4.22e-5   5.73e-5   1.51e-5     26.39

Nikkei      5     1.26e-4   1.58e-4   3.19e-5     1.48e-5   2.10e-4   6.24e-5     24.72
N=225       6     1.15e-4   1.41e-4   2.58e-5     1.31e-5   2.20e-4   8.93e-5     35.87
            7     8.81e-5   1.21e-4   3.38e-5     1.18e-5   1.82e-4   6.39e-5     32.85
            8     5.94e-5   9.34e-5   3.40e-5     1.08e-5   1.66e-4   5.83e-5     43.69
            9     5.96e-5   8.14e-5   2.18e-5     9.89e-5   1.62e-4   6.29e-5     49.92
            10    7.08e-5   6.96e-5   1.29e-6     9.47e-5   1.59e-4   6.42e-5     56.25


According to Table 3, SupO > 0 in all cases, which shows that our model has a lower
out-of-sample prediction error than the L1 model.
Table 3: Comparison of the L1 model with our model

Index               Our model                       L1 model
Scale         K     TEI2      TEO2      Cons2       TEI1      TEO1      Cons1     SupO(%)

S&P 500       10    1.49e-4   4.70e-4   2.92e-4     1.08e-4   3.44e-4   2.36e-4     26.88
N=457         20    7.93e-5   2.76e-4   1.97e-4     3.27e-5   1.73e-4   1.41e-4     37.18
              30    5.03e-5   2.23e-4   1.72e-4     3.78e-5   1.61e-4   1.23e-4     27.69
              40    3.42e-5   1.68e-4   1.34e-4     3.81e-5   1.10e-4   7.17e-5     34.78
              50    2.57e-5   1.42e-4   1.16e-4     4.18e-5   1.14e-4   7.27e-5     19.17

Russell 2000  10    4.62e-4   5.98e-4   1.37e-4     2.29e-4   5.77e-4   3.48e-4      3.52
N=1318        20    1.56e-4   4.34e-4   2.78e-4     1.20e-4   3.83e-4   2.63e-4     11.86
              30    1.06e-4   4.08e-4   3.03e-4     1.30e-4   3.23e-4   1.93e-4     20.94
              40    5.48e-5   3.20e-4   2.66e-4     7.89e-5   2.32e-4   1.53e-4     27.68
              50    5.22e-5   2.89e-4   2.36e-4     9.78e-5   2.62e-4   1.65e-4      9.06

Russell 3000  10    1.26e-4   4.91e-4   3.65e-4     3.78e-4   3.97e-4   1.94e-5     19.14
N=2151        20    7.44e-5   3.09e-4   2.34e-4     1.22e-4   2.37e-4   1.15e-4     23.26
              30    3.88e-5   2.37e-4   1.98e-4     1.27e-4   2.28e-4   1.00e-4      3.96
              40    3.39e-5   2.07e-4   1.73e-4     8.45e-5   1.06e-4   1.22e-4      0.44
              50    3.71e-5   1.69e-4   1.32e-4     1.31e-4   1.67e-4   3.56e-5      1.44


5 Conclusions 

Index tracking is a passive financial strategy that aims at replicating the performance
and risk profile of a given index. One of the most common approaches to the index
tracking problem consists of minimizing a given tracking error measure while limiting
the maximum number of assets held in the portfolio. Having few active positions
reduces the administrative and transaction costs and avoids holding very small and
illiquid positions, especially when the index has a large number of constituents.
However, imposing an upper bound on the number of constituents of the tracking
portfolio makes the optimization problem NP-hard. Different quantitative approaches
have been proposed to tackle such an optimization problem, and most of them rely on
search heuristics. On the other hand, L1 regularization methods have found application
in mean-variance portfolio settings, where they promote the identification of sparse
portfolios with good out-of-sample properties and low turnover. However, the L1
regularization approach is ineffective for the index tracking problem: under the budget
and no-short-selling constraints, the 1-norm of the asset weights has a constant value
of one.
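
The constant 1-norm can be seen in a small numerical check. The sketch below is
illustrative only: it assumes equally weighted portfolios of K stocks drawn from a
hypothetical universe of N = 31 stocks, and the helper names are not from the paper.
For any such portfolio the 1-norm equals one, whereas the L1/2 penalty (the sum of
the square roots of the weights) equals sqrt(K) and therefore grows with the number
of stocks held.

```python
import math


def l1_norm(weights):
    """Sum of absolute weights; equals 1 for any long-only, fully invested portfolio."""
    return math.fsum(abs(w) for w in weights)


def l_half_penalty(weights):
    """Sum of square roots of absolute weights, i.e. the L1/2 quasi-norm to the power 1/2."""
    return math.fsum(math.sqrt(abs(w)) for w in weights)


def equal_weight_portfolio(k, n):
    """Hypothetical portfolio: 1/k in the first k of n stocks, zero elsewhere."""
    return [1.0 / k] * k + [0.0] * (n - k)


N = 31  # assumed universe size, for illustration only
for K in (5, 10, 31):
    w = equal_weight_portfolio(K, N)
    print(f"K={K:2d}  L1 norm={l1_norm(w):.4f}  L1/2 penalty={l_half_penalty(w):.4f}")
```

Because the L1 term takes the same value on every feasible portfolio, it cannot
distinguish sparse portfolios from dense ones, while the L1/2 term can.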

In this paper we have used a new constraint on the tracking portfolio’s weights,
$\|w\|_{1/2}^{1/2}$, to replace the cardinality constraint $\|w\|_0 = K$, which is
equivalent to $\sum_{j=1}^{N} z_j = K$ with $z_j = 0$ or $1$. A new sparse index
tracking model was established by minimizing the tracking error. Unlike other stock
index tracking models, our model has high consistency and low out-of-sample prediction
error, and it preserves the sparsity of the optimal tracking portfolio as much as
possible. Meanwhile, since the half thresholding algorithm is a fast solver for L1/2
regularization, we have extended it to a hybrid half thresholding algorithm for solving
the proposed index tracking model. With appropriate parameter selection, the algorithm
is fast and efficient for the sparse index tracking model. Furthermore, it is simple,
convenient to use, and applicable to large-scale problems.
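
For reference, the componentwise half thresholding operator that underlies such
algorithms can be sketched as follows. This is a minimal sketch that assumes the
closed form commonly given in the L1/2 regularization literature; it is not the hybrid
algorithm of this paper, whose handling of the budget and no-short-selling constraints
is not reproduced here. The function names and the parameter `lam` (the effective
threshold parameter, e.g. the regularization parameter times the step size) are
illustrative.

```python
import math


def half_threshold(x, lam):
    """Half thresholding of a scalar x with effective parameter lam > 0.

    Values with |x| at or below the threshold are set to zero; larger values
    are shrunk towards zero (assumed closed form from the L1/2 literature).
    """
    threshold = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(x) <= threshold:
        return 0.0
    # Above the threshold the acos argument lies strictly between 0 and 1.
    phi = math.acos((lam / 8.0) * (abs(x) / 3.0) ** -1.5)
    return (2.0 / 3.0) * x * (1.0 + math.cos(2.0 * math.pi / 3.0 - 2.0 * phi / 3.0))


def half_threshold_vector(values, lam):
    """Apply the scalar operator to every component of a vector."""
    return [half_threshold(v, lam) for v in values]


if __name__ == "__main__":
    sample = [-3.0, -0.5, 0.0, 0.9, 2.0, 10.0]
    # Small components are zeroed, large ones are kept almost unchanged.
    print(half_threshold_vector(sample, lam=1.0))
```

Unlike soft thresholding, the operator jumps at the threshold rather than passing
continuously through zero, and it leaves large components almost unchanged.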

We have tested the performance of our model and algorithm on the eight data sets
from the OR-library. Numerical results show that the sparse index tracking model
and the hybrid half thresholding algorithm have high consistency and better
out-of-sample prediction ability. We believe the sparse index tracking model and the
hybrid half thresholding algorithm can provide a useful reference for managers of
index derivatives. Next we plan to extend our results to index tracking problems with
transaction costs.